AI News

see State of AI


Judea Pearl on what LLMs do

Reid Hoffman and Vinod Khosla talk:

Hoffman:

But Hoffman said that he’s not on board with Andreessen’s approach. “It’s kind of dumb to think that when you have major technologies there can’t be negative side effects,” he said, noting that all his AI projects have safety teams. “Tech can be amazing. Let’s be intentional about building.”

Khosla:

He’s not buying existential risk, calling it “nonsensical” talk from academics who had nothing better to do. But he’s long on China risk. Khosla also gave a grudging endorsement of the Biden Executive Order, saying it was “okay.”

AI and Creativity

Columns.ai helps find datasets

Hugging Face Chat is an open-source competitor to ChatGPT. It’s not as good, obviously, but it correctly analyzed some R code, and it gave the same even-handed caveats (“As a large language model…”) on questions about Donald Trump and Joe Biden.
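The open models behind Hugging Face Chat can also be queried programmatically. A minimal sketch, assuming the huggingface_hub InferenceClient and access to a hosted Llama-2 chat endpoint; the model id and the prompt template are illustrative, not a fixed recipe:

```python
# Minimal sketch: query an open chat model through the Hugging Face Inference API.
# Assumes huggingface_hub is installed and an HF token is configured; the model id
# and the Llama-2 [INST] prompt template are illustrative assumptions.
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Llama-2-70b-chat-hf")

r_snippet = 'fit <- lm(mpg ~ wt + hp, data = mtcars); summary(fit)'
prompt = f"[INST] Explain what this R code does:\n{r_snippet} [/INST]"

# Returns the generated text as a plain string.
print(client.text_generation(prompt, max_new_tokens=200))
```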

Run Stable Diffusion on a Raspberry Pi

and GPU-Accelerated LLM on a $100 Orange Pi
> On a $100 Orange Pi 5 with Mali GPU, we achieve 2.5 tok/sec for Llama2-7b and 5 tok/sec for RedPajama-3b through Machine Learning Compilation (MLC) techniques. Additionally, we are able to run a Llama-2 13b model at 1.5 tok/sec on a 16GB version of the Orange Pi 5+ under $150.
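For reference, a minimal sketch of driving the same MLC stack from Python, assuming the mlc_chat package, a prebuilt 4-bit Llama-2 artifact, and OpenCL support for the Mali GPU; the package, model, and device names are taken from MLC’s docs at the time and may have changed:

```python
# Sketch: chat with a quantized Llama-2 build through MLC's Python API.
# Assumes the mlc_chat package and the prebuilt "Llama-2-7b-chat-hf-q4f16_1"
# weights are installed per the MLC LLM docs; "opencl" targets the Mali GPU.
from mlc_chat import ChatModule

cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1", device="opencl")
print(cm.generate(prompt="Summarize what Machine Learning Compilation does."))
print(cm.stats())  # runtime stats, including decode speed in tok/sec
```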

Matt Rickard explains why data matters less than it used to:

A lot of the focus on LLM improvement is on model and dataset size, but there’s early evidence that LLMs are strongly influenced by the quality of the data they’re trained on: WizardLM, TinyStories, and phi-1 are examples. RLHF datasets matter in the same way.

On the other hand, ~100 data points are enough for significant improvement when fine-tuning for output format and custom style; LLM researchers at Databricks, Meta, Spark, and Audible have done empirical analyses of how much data fine-tuning needs. That amount of data is easy to create or curate manually.

Model distillation is real and simple to do. You can use one LLM to generate synthetic data to train or fine-tune your own LLM, and some of the knowledge will transfer over. This is only an issue if you expose the raw LLM to a counterparty (much less so if it’s used internally), but it means that any data that isn’t especially unique can be copied easily.
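As an illustration of both points, the sketch below uses a stronger “teacher” model to generate a small synthetic dataset (on the order of the ~100 examples mentioned above) in a chat-style JSONL format ready for fine-tuning. The teacher model, system prompt, questions, and file layout are assumptions for the sketch, not a recipe from the post:

```python
# Sketch: distill a style/format from a teacher LLM into a small synthetic
# fine-tuning set (~100 examples). The model name, system prompt, questions,
# and JSONL layout are illustrative placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "Answer in exactly three bullet points, each under 15 words."
questions = [f"Explain topic #{i} from our internal docs." for i in range(100)]

with open("synthetic_finetune.jsonl", "w") as f:
    for q in questions:
        teacher = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": q}],
        )
        answer = teacher.choices[0].message.content
        # One chat-formatted training example per line.
        f.write(json.dumps({"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": q},
            {"role": "assistant", "content": answer},
        ]}) + "\n")
```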

But the authors at Generating Conversation explain that OpenAI is too cheap to beat. Some back-of-the-envelope calculations show that it costs their company 8-20x more to run a model themselves than to use the OpenAI API, thanks to OpenAI’s economies of scale.
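To see the shape of that back-of-the-envelope math, here is a toy comparison of a per-token API price against the cost of renting a GPU to serve an open model yourself. Every number is an illustrative placeholder, not a figure from the Generating Conversation post; the real gap comes largely from how much better a large provider can keep its GPUs utilized:

```python
# Toy back-of-the-envelope: API price vs. self-hosted serving cost per token.
# All numbers are illustrative placeholders, not figures from the post.
api_price_per_1k_tokens = 0.002     # USD per 1K tokens from the API
gpu_cost_per_hour = 4.00            # USD/hour for a rented inference GPU
self_hosted_tokens_per_sec = 50     # sustained throughput you actually achieve

self_hosted_price_per_1k = gpu_cost_per_hour / (self_hosted_tokens_per_sec * 3600) * 1000
ratio = self_hosted_price_per_1k / api_price_per_1k_tokens

print(f"API:         ${api_price_per_1k_tokens:.4f} per 1K tokens")
print(f"Self-hosted: ${self_hosted_price_per_1k:.4f} per 1K tokens")
print(f"Self-hosting costs about {ratio:.1f}x the API price")  # ~11x with these inputs
```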